# Few-shot fine-tuning

**Swf Trained Model** (nagarajuthirupathi) · Apache-2.0 · Image Segmentation, Transformers · 132 downloads · 0 likes
An image segmentation model fine-tuned from mukesh3444/window_detection_model on the nagarajuthirupathi/indoor_window_detection_swf dataset, targeting indoor window detection.
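
If the checkpoint follows the standard Transformers setup, it can be tried through the image-segmentation pipeline. A minimal sketch, assuming the repo id is simply the author and model name joined (not confirmed by the listing):

```python
from transformers import pipeline

# Hypothetical repo id assembled from the listing (author/model name); verify before use.
segmenter = pipeline(
    "image-segmentation",
    model="nagarajuthirupathi/swf_trained_model",
)

# Run on a local photo of a room; each result carries a label, an optional score, and a PIL mask.
results = segmenter("indoor_room.jpg")
for r in results:
    score = round(r["score"], 3) if r["score"] is not None else None
    print(r["label"], score)
```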

**Plushies** (camenduru) · Openrail · Text-to-Image, English · 19 downloads · 22 likes
A text-to-image generation model built on the Flax framework, designed for generating plush-toy-style images.

**Segformer B0 Finetuned Morphpadver1 Hgo Coord** (NICOPOI-9) · Other · Image Segmentation, Transformers · 98 downloads · 0 likes
An image segmentation model fine-tuned from nvidia/mit-b0 that performs strongly on the NICOPOI-9/morphpad_coord_hgo_512_4class dataset.

**Light R1 32B DS** (qihoo360) · Apache-2.0 · Large Language Model, Transformers · 1,136 downloads · 13 likes
Light-R1-32B-DS is a near-SOTA 32B mathematical reasoning model fine-tuned from DeepSeek-R1-Distill-Qwen-32B, reaching high performance with only 3K SFT examples.
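
A minimal chat-style generation sketch for the math model above, assuming it uses the standard Transformers causal-LM interface and that the repo id is qihoo360/Light-R1-32B-DS (inferred from the listing, not verified). A 32B model needs substantial GPU memory or quantization:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id derived from the listing; confirm on the hub first.
model_id = "qihoo360/Light-R1-32B-DS"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

messages = [{"role": "user", "content": "Prove that the sum of two even integers is even."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```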

**Tunisian TTS** (amenIKh) · Speech Synthesis, Arabic · 48 downloads · 2 likes
An XTTS v2 text-to-speech model fine-tuned on a custom Tunisian dataset.

**Granite Timeseries Ttm R2** (ibm-granite) · Apache-2.0 · Climate Model · 217.99k downloads · 89 likes
TinyTimeMixers (TTMs) are compact pretrained models for multivariate time-series forecasting, open-sourced by IBM Research. Starting at roughly 1 million parameters, they introduced the concept of "miniature" pretrained models to time-series forecasting.

**Urdu Text To Speech Tts** (HamzaSidhu786) · MIT · Speech Synthesis, Transformers, Other · 46 downloads · 2 likes
An Urdu TTS model fine-tuned from microsoft/speecht5_tts on a small training set of 4,200 sentences; commercial use requires retraining.
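
Since the model is a microsoft/speecht5_tts fine-tune, the usual SpeechT5 recipe should apply. A sketch, with the repo id and the speaker embedding treated as assumptions (a real run would load an x-vector matching the fine-tuning speaker):

```python
import torch
from transformers import SpeechT5Processor, SpeechT5ForTextToSpeech, SpeechT5HifiGan

# Assumed repo id built from the listing; check the actual model card.
model_id = "HamzaSidhu786/urdu-text-to-speech-tts"

processor = SpeechT5Processor.from_pretrained(model_id)
model = SpeechT5ForTextToSpeech.from_pretrained(model_id)
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

inputs = processor(text="...Urdu text here...", return_tensors="pt")

# SpeechT5 conditions on a 512-dim speaker embedding; a zero vector is only a placeholder.
speaker_embeddings = torch.zeros((1, 512))

speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
print(speech.shape)  # 1-D waveform tensor at 16 kHz
```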

**Speecht5 Base Cs Tts** (fav-kky) · Speech Synthesis, Transformers, Other · 66 downloads · 0 likes
A monolingual Czech SpeechT5 base model, pre-trained on 120,000 hours of Czech audio and a 17.5-billion-word text corpus, designed as a starting point for Czech TTS fine-tuning.

**Florence 2 DocVQA** (HuggingFaceM4) · Text-to-Image, Transformers · 3,096 downloads · 60 likes
A version of Microsoft's Florence-2 model fine-tuned for one day on 5% of the Docmatix dataset with a learning rate of 1e-6.
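
Florence-2 fine-tunes are typically driven by task-prefix prompts through the model's custom code. A hedged sketch, assuming the repo id from the listing and a `<DocVQA>` prefix; both should be checked against the model card:

```python
from PIL import Image
from transformers import AutoProcessor, AutoModelForCausalLM

# Florence-2 checkpoints ship custom modeling code, hence trust_remote_code=True.
# Repo id taken from the listing; treat it as an assumption.
model_id = "HuggingFaceM4/Florence-2-DocVQA"

processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("invoice_page.png").convert("RGB")
prompt = "<DocVQA>What is the total amount due?"  # assumed task prefix for this fine-tune

inputs = processor(text=prompt, images=image, return_tensors="pt")
generated_ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=64,
)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```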

**Kosmos 2 PokemonCards Trl Merged** (Mit1208) · Image-to-Text, Transformers, English · 51 downloads · 1 like
A multimodal model fine-tuned from Microsoft's Kosmos-2, specifically trained to read Pokemon names from Pokemon cards.
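
Kosmos-2 checkpoints are usually loaded with AutoProcessor and AutoModelForVision2Seq. A sketch under the assumption that this fine-tune keeps the base Kosmos-2 interface; the repo id and prompt below are illustrative guesses:

```python
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

# Assumed repo id from the listing; verify before use.
model_id = "Mit1208/Kosmos-2-PokemonCards-trl-merged"

processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id)

image = Image.open("pokemon_card.jpg").convert("RGB")
prompt = "Pokemon name:"  # hypothetical prompt; the training prompt may differ

inputs = processor(text=prompt, images=image, return_tensors="pt")
generated_ids = model.generate(
    pixel_values=inputs["pixel_values"],
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    image_embeds_position_mask=inputs["image_embeds_position_mask"],
    max_new_tokens=32,
)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```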

**Llama 3 8b Patent Small Dataset** (kimhyeongjun) · Other · Large Language Model, Transformers, English · 17 downloads · 4 likes
A model fine-tuned from Meta-Llama-3-8B-Instruct on a small dataset of 16,000 English translations of Korean patents, intended for testing purposes only.

**Gemma 1.1 7b It Fictional Chinese V1** (yzhuang) · Large Language Model, Transformers · 21 downloads · 1 like
A Chinese-language model fine-tuned from google/gemma-1.1-7b-it on the generator dataset.

**Videomae Base Finetuned Subset** (Joy28) · Video Processing, Transformers · 2 downloads · 0 likes
A video understanding model fine-tuned from MCG-NJU/videomae-base on an unspecified dataset, reaching 67.13% accuracy.

**Mms Spa Finetuned Colombian Monospeaker** (ylacombe) · Speech Synthesis, Transformers, Spanish · 71 downloads · 1 like
A Spanish TTS model based on MMS and the VITS architecture, fine-tuned with only 80-150 samples and about 20 minutes of training to produce Spanish speech with a Colombian accent.

**Mms Spa Finetuned Argentinian Monospeaker** (ylacombe) · Speech Synthesis, Transformers, Spanish · 88 downloads · 3 likes
A fine-tuned version of the MMS Spanish model, built on the VITS architecture and trained with only 80 to 150 samples in roughly 20 minutes, producing Spanish speech with an Argentinian accent.
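
Both MMS Spanish fine-tunes above use the VITS architecture, so the standard MMS-TTS loading recipe should work. A sketch with an assumed repo id (shown for the Colombian variant):

```python
import torch
from transformers import VitsModel, AutoTokenizer

# Assumed repo id from the listing; verify on the hub.
model_id = "ylacombe/mms-spa-finetuned-colombian-monospeaker"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = VitsModel.from_pretrained(model_id)

inputs = tokenizer("Hola, ¿cómo estás?", return_tensors="pt")
with torch.no_grad():
    waveform = model(**inputs).waveform  # (batch, samples) at model.config.sampling_rate
print(waveform.shape, model.config.sampling_rate)
```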

**Distil Ast Audioset Finetuned Cry** (jstoone) · Apache-2.0 · Audio Classification, Transformers · 76 downloads · 1 like
An audio classification model fine-tuned from bookbot/distil-ast-audioset on the DonateACry dataset, designed to detect infant cries.
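
If the checkpoint exposes a standard audio-classification head, the pipeline API is enough to try it. A sketch with an assumed repo id:

```python
from transformers import pipeline

# Assumed repo id from the listing; confirm before use.
classifier = pipeline(
    "audio-classification",
    model="jstoone/distil-ast-audioset-finetuned-cry",
)

# Accepts a path to an audio file (or a raw numpy waveform); returns labels with scores.
for prediction in classifier("baby_monitor_clip.wav", top_k=3):
    print(prediction["label"], round(prediction["score"], 3))
```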

**Abap Nous Hermes** (smjain) · Apache-2.0 · Large Language Model, Transformers, English · 51 downloads · 1 like
An ABAP programming-language model fine-tuned from Llama-2-7b-chat-hf, specifically aimed at generating ABAP code.

**Segformer Finetuned Ihc** (Isaacks) · Other · Image Segmentation, Transformers · 14 downloads · 0 likes
An image segmentation model fine-tuned from nvidia/mit-b0 on the Isaacks/ihc_slide_tissue dataset.

**Digit Mask Unispeech Sat Base Ft** (mazkooleg) · Speech Recognition, Transformers · 25 downloads · 0 likes
A speech processing model fine-tuned from microsoft/unispeech-sat-base, specialized for digit-masking tasks and reporting strong results on its evaluation set.

**Swin Tiny Patch4 Window7 224 Finetuned Eurosat** (fyztkr) · Apache-2.0 · Image Classification, Transformers · 19 downloads · 0 likes
A vision model fine-tuned from microsoft/swin-tiny-patch4-window7-224 on an image-folder dataset.
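
The Swin, ViT, ConvNeXt, DeiT, and BEiT classifiers in the rest of this list all share the same loading recipe. A sketch using the image-classification pipeline, with the repo id guessed from this entry:

```python
from transformers import pipeline

# Assumed repo id (author/model name from the listing); swap in any of the classifiers below.
classifier = pipeline(
    "image-classification",
    model="fyztkr/swin-tiny-patch4-window7-224-finetuned-eurosat",
)

# Returns the top-k labels with softmax scores for a local image.
for prediction in classifier("satellite_tile.png", top_k=3):
    print(prediction["label"], round(prediction["score"], 3))
```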

**Vit Base Railspace** (Kaspar) · Apache-2.0 · Image Classification, Transformers · 18 downloads · 2 likes
A Vision Transformer model fine-tuned from google/vit-base-patch16-224-in21k, achieving 99.26% accuracy on its evaluation set.

**Donut Base Finetuned Latvian Receipts V2** (Inesence) · MIT · Text Recognition, Transformers · 13 downloads · 0 likes
A model based on the Donut architecture, fine-tuned specifically for Latvian receipt data.

**Donut Base Finetuned Latvian Receipts** (Inesence) · MIT · Text Recognition, Transformers · 31 downloads · 0 likes
A fine-tuned version of donut-base on a Latvian receipt dataset, primarily used for receipt image processing tasks.
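
Donut fine-tunes are driven by a task-specific start prompt and decoded into JSON. A sketch that should apply to both Latvian receipt models above, with the repo id and the `<s_receipt>` prompt marked as assumptions:

```python
from PIL import Image
from transformers import DonutProcessor, VisionEncoderDecoderModel

# Assumed repo id from the listing; the real start token is defined by the fine-tune.
model_id = "Inesence/donut-base-finetuned-latvian-receipts"

processor = DonutProcessor.from_pretrained(model_id)
model = VisionEncoderDecoderModel.from_pretrained(model_id)

image = Image.open("receipt.jpg").convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values

task_prompt = "<s_receipt>"  # hypothetical; check the model card for the actual prompt token
decoder_input_ids = processor.tokenizer(
    task_prompt, add_special_tokens=False, return_tensors="pt"
).input_ids

outputs = model.generate(
    pixel_values,
    decoder_input_ids=decoder_input_ids,
    max_length=model.decoder.config.max_position_embeddings,
    pad_token_id=processor.tokenizer.pad_token_id,
    eos_token_id=processor.tokenizer.eos_token_id,
    use_cache=True,
)
# token2json turns the generated tag sequence into a nested dict of receipt fields.
print(processor.token2json(processor.batch_decode(outputs)[0]))
```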

**Deit Tiny Patch16 224 Finetuned Og Dataset 10e** (Gokulapriyan) · Apache-2.0 · Image Classification, Transformers · 17 downloads · 0 likes
A lightweight image classification model based on the DeiT-tiny architecture, reaching 94.8% accuracy after fine-tuning on a custom image dataset.

**Whisper Medium Catalan** (shields) · Apache-2.0 · Speech Recognition, Transformers, Other · 19 downloads · 2 likes
A speech recognition model fine-tuned from OpenAI's Whisper Medium on the Catalan Common Voice 11.0 dataset.
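
As a Whisper fine-tune, the model can be exercised with the automatic-speech-recognition pipeline. A sketch with an assumed repo id:

```python
from transformers import pipeline

# Assumed repo id from the listing; confirm before use.
asr = pipeline(
    "automatic-speech-recognition",
    model="shields/whisper-medium-catalan",
    chunk_length_s=30,  # long-form audio is split into 30-second chunks
)

print(asr("catalan_interview.mp3")["text"])
```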

**Beit Base Patch16 224 Pt22k Ft22k Finetuned FER2013CKPlus 7e 05 Finetuned SFEW 7e 05** (lixiqi) · Apache-2.0 · Image Classification, Transformers · 17 downloads · 0 likes
A BEiT-based vision Transformer fine-tuned on the FER2013CKPlus and SFEW datasets for facial expression recognition.

**Vit Base Patch16 224 In21k Lcbsi** (polejowska) · Apache-2.0 · Image Classification, Transformers · 33 downloads · 0 likes
A fine-tuned model based on Google's Vision Transformer (ViT) architecture, suitable for image classification tasks.

**Swin Tiny Patch4 Window7 224 Finetuned Eurosat** (QIANWEI) · Apache-2.0 · Image Classification, Transformers · 11 downloads · 0 likes
A tiny Swin Transformer image classification model fine-tuned on the EuroSAT dataset, suitable for remote-sensing image classification.

**Convnext Small 224 Leicester Binary** (davanstrien) · Apache-2.0 · Image Classification, Transformers · 13 downloads · 0 likes
An image classification model fine-tuned from facebook/convnext-small-224 for a binary classification task, achieving an F1 score of 0.9620 on its evaluation set.

**Convnext Tiny 224 Finetuned On Unlabelled IA With Snorkel Labels** (ImageIN) · Image Classification, Transformers · 14 downloads · 0 likes
A ConvNeXt-Tiny computer vision model fine-tuned on unlabelled data using pseudo-labels generated with Snorkel.

**Bart Base Few Shot K 128 Finetuned Squad Seed 4** (anas-awadalla) · Apache-2.0 · Question Answering System, Transformers · 13 downloads · 0 likes
A BART-base question-answering model fine-tuned on SQuAD in a few-shot setting (K = 128 examples), suitable for reading comprehension tasks.
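
A quick way to probe the few-shot SQuAD model is the question-answering pipeline. A sketch with an assumed repo id; the context string is made up for illustration:

```python
from transformers import pipeline

# Assumed repo id from the listing; confirm before use.
qa = pipeline(
    "question-answering",
    model="anas-awadalla/bart-base-few-shot-k-128-finetuned-squad-seed-4",
)

result = qa(
    question="What does the few-shot K value refer to?",
    context="In these runs, K denotes the number of labelled SQuAD examples used for fine-tuning.",
)
print(result["answer"], round(result["score"], 3))
```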

**Convnext Tiny 224 Finetuned** (ImageIN) · Apache-2.0 · Image Classification, Transformers · 15 downloads · 0 likes
A fine-tuned version of facebook/convnext-tiny-224, primarily used for image classification and reporting strong results on its evaluation set.

**Vit Base Patch16 224 In21k Wwwwii** (Imene) · Apache-2.0 · Image Classification, Transformers · 21 downloads · 0 likes
A vision classification model fine-tuned from Google's Vision Transformer (ViT) base model, suitable for image classification tasks.

**Swin Tiny Patch4 Window7 224 Finetuned Eurosat Kornia** (nielsr) · Apache-2.0 · Image Classification, Transformers · 16 downloads · 0 likes
A fine-tuned Swin Transformer image classification model achieving 98.3% accuracy on an image-folder dataset.

**Swin Tiny Patch4 Window7 224 Finetuned Eurosat** (Chandanab) · Apache-2.0 · Image Classification, Transformers · 13 downloads · 0 likes
A fine-tuned Swin Transformer image classification model achieving 93.94% accuracy on the EuroSAT dataset.

**Swin Tiny Patch4 Window7 224 Finetuned Eurosat** (HekmatTaherinejad) · Apache-2.0 · Image Classification, Transformers · 16 downloads · 0 likes
A fine-tuned Swin Transformer image classification model achieving 98% accuracy on the EuroSAT dataset.

**Swin Tiny Patch4 Window7 224 Finetuned Eurosat** (aricibo) · Apache-2.0 · Image Classification, Transformers · 15 downloads · 0 likes
A fine-tuned Swin Transformer Tiny image classification model achieving 97.26% accuracy on the EuroSAT dataset.

**Beit Finetuned** (jadohu) · Apache-2.0 · Image Classification, Transformers · 24 downloads · 1 like
A BEiT base model fine-tuned on CIFAR-10 for image classification, achieving 99.18% accuracy on its evaluation set.

**Swin Tiny Patch4 Window7 224 Finetuned Eurosat** (jemole) · Apache-2.0 · Image Classification, Transformers · 14 downloads · 0 likes
A fine-tuned model based on the Swin Transformer Tiny architecture for image classification, achieving 97.59% accuracy on its evaluation set.

**Swin Tiny Patch4 Window7 224 Finetuned Eurosat** (nielsr) · Apache-2.0 · Image Classification, Transformers · 51 downloads · 3 likes
A fine-tuned Swin Transformer image classification model achieving 97.44% accuracy on an image-folder dataset.